Skip to main content

Prompt for Visual Language Models to extract content from images by describing them

Please analyze this image comprehensively and provide the following information:

  1. General Overview:
    • Main subject matter and content
    • Overall composition and context
    • Key visual elements present
  2. Text Content:
    • Transcribe any visible text accurately
    • Include headers, labels, and captions
    • Note any important text formatting or emphasis
  3. Data Visualization (if present):
    • For tables:
      • Convert to markdown format
      • Preserve column headers and data relationships
    • For charts/graphs:
      • Describe the type of visualization
      • Explain key trends and patterns
      • List important data points and values
    • For diagrams/flowcharts:
      • Explain the structure and relationships
      • Describe the flow or process
      • Note any important symbols or annotations
  4. Additional Details:
    • Identify any branding or logos
    • Note color schemes if significant
    • Describe any relevant metadata or context

Please format the response clearly and maintain the original structure of any data.

Ref

1: A Dead Simple Way to VLM Parsing, Zotero